Selection of antibody with some initial binding (step 1)

Initial design

a) Identify positions where substitution is acceptable and
choose substitutions to explore, (step 2)

b) Design an initial small set of variants using experimental 
design methods, (step 3) 



Modify initial 
selection 
parameters based 
on performance 
(step 9)



Optionally add new 
substitutions from step 2 for
inclusion in the new variant set 



Synthesize and test antibody 
variant set for target binding and 
viral neutralizing activity, (step 4) 



End-point 
reached 



Propose a new variant set 
based on the model, 
(step 7) 



iterate 




Derive sequence-activity relationships (step 5)

Combine results from different sequence-function 
models, (step 6) 



Select the best variant(s) (step 8). 
Use sequences and activities of 
these variants to modify algorithms 
used for substitution selection (step 
9) and sequence-function modeling, 
(step 10) 



Modify methods for combining 

different sequence-function 
models based on performance 
(Step 10) 



Figure 2 
Select initial candidate antibody sequence(s)




A: Substitutions from homologous sequences in framework & CDR

• Identify framework sequences within a defined evolutionary distance (PAM units)

• Reconstruct phylogenetic tree for framework region only
RULE 1a:

• Select a defined number of framework residues that have undergone advantageous change
RULE 2a:

• Select a defined number of framework and a defined number of CDR positions with highest
mutability index

RULE 3a:

• Select M amino acid substitutions from sequences in the same Chothia class
SCORE: Weighted value for each rule satisfied




B: Substitutions from homologous structures
• Superpose homologous structures from PDB
RULE 1b:

• Calculate ΔG change for all single substitutions

• Select changes with less than a defined value (in kcal/mol) of change in free energy

RULE 2b:

• Estimate mean RMSD for every window of a defined number of residues
• Select framework sites with an RMSD greater than a defined value
RULE 3b:

• Identify changes found in homologous sequences

• Select framework varying sites within a defined distance from CDR
SCORE: Weighted value for each rule satisfied






C: Substitutions from substitution matrices

• Calculate substitution matrix for all framework regions and canonical classes, rank all possible
single substitutions for favorability
RULE 1c:

• Select highest scoring substitutions for each matrix
RULE 2c:

• Rank all possible single substitutions for favorability using a universal substitution matrix

• Select highest scoring positions

SCORE: Weighted value for each rule satisfied




D: Substitutions from PCA analysis 

• Determine principal components of sequence variation in alignment of homologs 
RULE 1d: 

• Group CDRs based on PCA of amino acid sequences in the CDR.

• Select highest scoring CDR positions that differentiate antibody sequences by function 
RULE 2d: 

•Group CDR positions based on observed amino acid frequencies at each site 

• Rank the groups based on contributions to one or more principal components 

• Select the top groups of sites to vary. 
SCORE: Weighted value for each rule satisfied 


E: Substitutions from binding pocket analysis
RULE 1e:

• Select sites where physico-chemical properties of residues are conserved in the
pocket
RULE 2e:

• Select CDR changes derived from evolutionary models to correlate properties of amino
acids. 

SCORE: Weighted value for each rule satisfied 



Figure 3 
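The scoring scheme of Figure 3, in which each candidate substitution accumulates a weighted value for every rule it satisfies, can be sketched as follows. This is a minimal illustration only: the rule weights, the function names, and the example rule assignments are hypothetical, since the figure states only that the score is a weighted value per satisfied rule.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical weights per rule; the figure does not give numeric values.
RULE_WEIGHTS: Dict[str, float] = {"1a": 2.0, "2a": 1.0, "1b": 3.0, "1c": 1.5}


def score_substitution(rules_satisfied: Set[str],
                       weights: Dict[str, float] = RULE_WEIGHTS) -> float:
    """Sum the weight of every rule that the substitution satisfies."""
    return sum(weights[r] for r in rules_satisfied if r in weights)


def rank_substitutions(candidates: Dict[str, Set[str]]) -> List[Tuple[str, float]]:
    """Return (substitution, score) pairs, highest score first."""
    scored = [(sub, score_substitution(rules)) for sub, rules in candidates.items()]
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Illustrative rule assignments only; not taken from the source.
    example = {"S273T": {"1a", "1b", "1c"}, "P97S": {"2a"}, "N95C": {"1b"}}
    for sub, score in rank_substitutions(example):
        print(sub, score)
```

A substitution supported by several independent rules rises to the top of the ranking, which matches the intent of summing one weighted term per rule.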
A: Calculate average weight vectors:

a) Build sequence-function model from data
in which rows (sequence + function) and/or
columns (substitutions) are randomly
omitted. Calculate weight vectors.

b) Repeat 1,000 times.

c) Calculate average values and standard 
deviations for weight vectors and rank in 
order of importance. 



Figure 4 



A: Calculate average weight vectors: 

a) Build sequence-function model from data 
in which rows (sequence + function) and/or 
columns (substitutions) are randomly 
omitted. Calculate weight vectors. 

b) Repeat 1,000 times.

c) Calculate average values and standard
deviations for weight vectors and rank in
order of importance.



B: Calculate weight vectors from 
randomized data: 

a) Randomly associate sequence data with 
function data 

b) Build sequence-function model and
calculate weight vectors.

c) Repeat 1,000 times.

d) Calculate average value and standard 
deviations for weight vectors from 
randomized data. 



C: Calculate the number of standard deviations by which each
weight vector lies above the value from randomized data.



Figure 5 
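The three panels of Figure 5 can be sketched in code as below. This is an illustration under stated assumptions, not the method of the source: the sequence-function model is taken to be an ordinary least-squares linear model, only rows (not columns) are omitted during resampling, and all function names and the `keep` fraction are hypothetical.

```python
import numpy as np


def fit_weights(X, y):
    """Assumed model form: linear least squares, y ~ X @ w."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def resampled_weights(X, y, n_rounds=1000, keep=0.8, rng=None):
    """Panel A: refit with rows randomly omitted; return mean and std of weights."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    k = max(p, int(keep * n))  # keep at least as many rows as weights
    ws = np.empty((n_rounds, p))
    for i in range(n_rounds):
        rows = rng.choice(n, size=k, replace=False)
        ws[i] = fit_weights(X[rows], y[rows])
    return ws.mean(axis=0), ws.std(axis=0)


def permuted_weights(X, y, n_rounds=1000, rng=None):
    """Panel B: randomly re-associate sequences with function values, then refit."""
    rng = np.random.default_rng(rng)
    ws = np.empty((n_rounds, X.shape[1]))
    for i in range(n_rounds):
        ws[i] = fit_weights(X, rng.permutation(y))
    return ws.mean(axis=0), ws.std(axis=0)


def weight_z_scores(X, y, n_rounds=1000, seed=0):
    """Panel C: standard deviations by which each weight exceeds the randomized value."""
    mean_w, _ = resampled_weights(X, y, n_rounds, rng=seed)
    null_mean, null_std = permuted_weights(X, y, n_rounds, rng=seed + 1)
    return (mean_w - null_mean) / null_std
```

Here `X` is a variants-by-substitutions matrix (1 where a variant carries a substitution, 0 otherwise) and `y` holds the measured activities. Substitutions whose z-score is large are those whose fitted contribution is unlikely to arise from a random pairing of sequence and function.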
E. coli leader peptide
-20        -10     -1
MKKLLFAIPL VVPFYSHSTM (SEQ ID NO.: 1)

Proteinase K
1          11         21         31         41
APAVEQRSEA APLIEARGEM VANKYIVKFK EGSALSALDA AMEKISGKPD

51         61         71         81         91
HVYKNVFSGP AATLDENMVR VLRAHPDVEY IEQDAVVTIN AAQTNAPWGL

101        111        121        131        141
ARISSTSPGT STYYYDSSAG QGSCVYVIDT GIEASKPEFE GRAQMVKTYY

151        161        171        181        191
YSSRDGNGHG THCAGTVGSR TYGVAKKTQL FGVKVLDDNG SGQYSTIIAG

201        211        221        231        241
MDFVASDKNN RNCPKGVVAS LSLGGGYSSS VNSAAARLQS SGVMVAVAAG

251        261        271        281        291
NNNADARNYS PASEPSVCTV GASDRYDRRS SPSNYGSVLD IFGPGTSILS

301        311        321        331        341
TWIGGSTRSI SGTSMATPHV AGLAAYLMTL GKTTAASACR YIADTANKGD

351        361        371
LSNIPFGTVN LLAYNNYQAV DHHHHHH (SEQ ID NO.: 2)

Figure 6
-60        -50        -40        -30        -20        -10     -1
atgaaaaaac tgctgttcgc gattccgctg gtggtgccgt tctatagcca tagcaccatg

1          11         21         31         41         51
GCACCGGCCG TTGAACAGCG TTCTGAAGCA GCTCCTCTGA TTGAGGCACG TGGTGAAATG

61         71         81         91         101        111
GTAGCAAACA AGTACATCGT GAAGTTCAAG GAGGGTTCTG CTCTGTCTGC TCTGGATGCT

121        131        141        151        161        171
GCTATGGAAA AGATCTCTGG CAAGCCTGAT CACGTCTATA AGAACGTGTT CAGCGGTTTC

181        191        201        211        221        231
GCAGCAACTC TGGACGAGAA CATGGTCCGT GTACTGCGTG CTCATCCAGA CGTTGAATAC

241        251        261        271        281        291
ATCGAACAGG ACGCTGTGGT TACTATCAAC GCGGCACAGA CTAACGCACC TTGGGGTCTG

301        311        321        331        341        351
GCACGTATTT CTTCTACTTC CCCGGGTACG TCTACTTACT ACTACGACGA GTCTGCCGGT

361        371        381        391        401        411
CAAGGTTCTT GCGTTTACGT GATCGATACG GGCATCGAGG CTTCTCATCC TGAGTTTGAA

421        431        441        451        461        471
GGCCGTGCAC AAATGGTGAA GACCTACTAC TACTCTTCCC GTGACGGTAA TGGTCACGGT

481        491        501        511        521        531
ACTCATTGCG CAGGTACTGT TGGTAGCCGT ACCTACGGTG TTGCTAAGAA AACGCAACTG

541        551        561        571        581        591
TTCGGCGTTA AAGTGCTGGA CGACAACGGT TCTGGTCAGT ACTCCACCAT TATCGCGGGT

601        611        621        631        641        651
ATGGATTTCG TAGCGAGCGA TAAAAACAAC CGCAACTGCC CGAAAGGTGT TGTGGCTTCT

661        671        681        691        701        711
CTGTCTCTGG GTGGTGGTTA CTCCTCTTCT GTTAACAGCG CAGCTGCACG TCTGCAATCT

721        731        741        751        761        771
TCCGGTGTCA TGGTCGCAGT AGCAGCTGGT AACAATAACG CTGATGCACG CAACTACTCT

781        791        801        811        821        831
CCTGCTAGCG AGCCTTCTGT TTGCACCGTG GGTGCATCTG ATCGTTATGA TCGTCGTAGC

841        851        861        871        881        891
TCCTTCAGCA ACTATGGTTC CGTCCTGGAT ATCTTCGGCC CTGGTACTTC TATCCTGTCT

Figure 7A
901 911 921 931 941 951

ACCTGGATTG GCGGTAGCAC TCGTTCCATT TCCGGTACGA GCATGGCTAC TCCACATGTT

961 971 981 991 1001 1011

GCTGGTCTGG CAGCATACCT GATGACCCTG GGTAAGACCA CTGCTGCATC CGCTTGTCGT

1021 1031 1041 1051 1061 1071 

TACATCGCGG ATACTGCGAA CiW^iGGCGAT CTGTCTAACA TCCCGTTCGG CACCGTTAAT 

1081 1091 1101 1111 1121 1131 

CTGCTGGCAT ACAACAACTA TCAGGCTgtc gaccatcatc atcatcatca tag 

(SEQ ID NO.: 3) 



Figure 7B 
gi|19171215|emb|CAD20578.1|/89
gi|19171217|emb|CAD20579.1|/1-
gi|19171219|emb|CAD20580.1|/1-
gi|19171221|emb|CAD20581.1|/1-
gi|16215662|emb|CAC95042.1|/90

gi|16506136|dbj|BAB70705.1|/78
gi|16506134|dbj|BAB70704.1|/78
gi|16506140|dbj|BAB70707.1|/78
gi|16215677|emb|CAC95049.1|/90
gi|117631|sp|P29138|CODP_METAN
gi|6624958|emb|CAB63911.1|/90-
gi|16215669|emb|CAC95045.1|/90
gi|460033|gb|AAA91584.1|/84-36
gi|6634475|emb|CAB64346.1|/87-
gi|16215664|emb|CAC95043.1|/87
gi|2351386|gb|AAC49831.1|/86-3
gi|8671180|emb|CAB95012.1|/85-
gi|16215666|emb|CAC95044.1|/85
gi|16215671|emb|CAC95046.1|/85
gi|4092486|gb|AAC99421.1|/64-2
gi|18542429|gb|AAL75579.1|AF46
SUTIKA/91-367

gi|131077|sp|P06873|PRTK_TRIAL
gi|230675|pdb|2PRK|/1-277
gi|494434|pdb|1PEK|E/1-277
gi|224977|prf||1205229A/1-275
gi|14278658|pdb|1IC6|A/1-277
gi|131084|sp|P23653|PRTR_TRIAL
gi|4761119|gb|AAD29255.1|AF104
gi|14626933|gb|AAK70804.1|/81-
gi|639712|gb|AAC48979.1|/83-34
gi|742825|prf||2011184A/84-362
gi|628051|pir||JC2142/84-362
gi|15808791|gb|AAL08502.1|AF41
gi|15808805|gb|AAL08509.1|AF41
gi|28918475|gb|EAA28148.1|/90-
gi|10181226|gb|AAC27316.2|/92-
gi|131088|sp|P20015|PRTT_TRIAL
gi|9971109|emb|CAC07219.1|/86-
gi|7543916|emb|CAB87194.1|/89-
gi|5813790|gb|AAD52013.1|AF082
gi|23894244|emb|CAD23614.1|/11
gi|22652141|gb|AAN03634.1|AF40
gi|24528136|emb|CAD24010.1|/10
gi|24528132|emb|CAD24008.1|/10
A35742./126-403

gi|114081|sp|P08594|AQL1_THEAQ
AAA82980./129-408

gi|15640187|ref|NP_229814.1|/1
AAA22247./107-361
Variation  Score  Primary contribution to score

N95C       5      Structural stability at higher temperature: from published literature
P97S       3      P to S for flexibility and structural perturbation
S107D      5      From active homologs
S123A      7      Thermostable consensus
E138A      5      From experiments in literature
M145F      5      From experiments to improve thermostability
Y151A      8      From experiments to improve thermostability
V167I      10     Allow user-specified conservative changes (controlled perturbation)
L180I      10     Allow user-specified conservative changes (controlled perturbation)
Y194S      10     Variation observed in highly active clone from our initial exp.
A199S      8      Allow user-specified conservative changes (controlled perturbation)
K208H      7      PCA modelling of homologs collected from GenBank
A236V      7      PCA modelling of homologs collected from GenBank
R237N      5      From experiments to improve thermostability (in literature)
P265S      3      P to S for flexibility and structural perturbation
V267I      10     Allow user-specified conservative changes (controlled perturbation)
S273T      15     Multiple sources identify this change (thermostability and other)
G293A      8      For thermostability considerations (observed in thermitases)
L299C      5      For disulphide bridges with N95C (from literature)
I310K      5      From structural studies
K332R      8      For thermostability considerations (observed in thermitases)
S337N      8      For thermostability considerations (observed in thermitases)
P355S      3      P to S for flexibility and structural perturbation

Figure 13 
variant-1: 123, 151, 293, 310, 332, 355
variant-2: 95, 145, 167, 199, 237, 273
variant-3: 97, 138, 180, 194, 236, 267
variant-4: 107, 132, 208, 265, 299, 337
variant-5: 123, 145, 151, 167, 273, 337
variant-6: 97, 107, 180, 236, 237, 310
variant-7: 123, 138, 199, 208, 265, 355
variant-8: 95, 194, 267, 293, 299, 332
variant-9: 95, 132, 138, 145, 167, 208
variant-10: 236, 237, 273, 293, 332, 355
variant-11: 97, 123, 265, 299, 310, 337
variant-12: 107, 151, 180, 194, 199, 267
variant-13: 95, 107, 123, 180, 194, 337
variant-14: 138, 151, 167, 199, 208, 299
variant-15: 97, 145, 237, 273, 293, 310
variant-16: 132, 236, 265, 267, 332, 355
variant-17: 97, 151, 199, 236, 299, 355
variant-18: 95, 107, 167, 180, 293, 310
variant-19: 145, 237, 265, 267, 332, 337
variant-20: 123, 132, 138, 194, 208, 273
variant-21: 123, 208, 236, 267, 293, 299
variant-22: 107, 132, 138, 145, 337, 355
variant-23: 97, 180, 194, 199, 265, 310
variant-24: 95, 151, 167, 237, 273, 332

Figure 14 
Variant #
Changes
97
variant-31
variant-32
variant-33
107, 123, 145
variant-34
151, 167, 180
variant-35
194, 199, 267
variant-36
273, 293, 310
variant-37
332, 337, 355
variant-38
107, 151, 194, 273, 332
variant-39
123, 167, 199, 293, 337
variant-40
145, 180, 267, 310, 355
variant-41
107, 167, 267, 273, 337
variant-42
123, 180, 194, 293, 355
variant-43
145, 151, 199, 310, 332
variant-44
145, 167, 194
variant-45
180, 199, 273
variant-46
267, 293, 332
variant-47
variant-48

Reasons

Confirm detrimental effect on enzyme
Confirm detrimental effect on enzyme
Confirm detrimental effect on enzyme
Confirm detrimental effect on enzyme
Confirm detrimental effect on enzyme
Confirm detrimental effect on enzyme
Confirm detrimental effect on enzyme
Confirm detrimental effect on enzyme
New combinations of positive changes
New combinations of positive changes
New combinations of positive changes
New combinations of positive changes
New combinations of positive changes
New combinations of positive changes
New combinations of positive changes
New combinations of positive changes
New combinations of positive changes
New combinations of positive changes
New combinations of positive changes
New combinations of positive changes
New combinations of positive changes
New combinations of positive changes
New combinations of positive changes
New combinations of positive changes
Align lm-1 heavy chain sequence with human germline Ig heavy chain
sequences from VBase using ClustalW.

A: Substitutions set
RULE 1a:

• Enumerate and classify the substitutions into 2 categories: (i) substitutions found in
the framework region and (ii) substitutions found in the CDR.

• Consider only these substitutions (i.e., RULE 1a is a filter)

B: Substitutions from human germline sequences

• Reconstruct phylogenetic tree
RULE 1b:

• Calculate evolutionary proximity of the closest homolog in which each
substitution occurs (EP)

RULE 2b:

• Calculate site heterogeneity at each substitution position (SH)
RULE 3b:

• Calculate entropy at each substitution position (SE)
RULE 4b:

• Calculate number of times a substitution is seen at a position in the set of
homologs (SN)

C: Substitutions from substitution matrices
RULE 1c:

• Calculate favorability of each substitution using a PAM100 matrix (SM).

D: Score

Score_FW = f(EP) × f(SH) × f(SE) × f(SN) × f(SM)
Score_CDR = f(SE) × f(SN) × f(SM)



Figure 22 
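The scoring in panel D of Figure 22 multiplies transformed per-substitution quantities. The sketch below shows one way the entropy (SE) and occurrence count (SN) of rules 3b and 4b might be computed from an alignment column and combined into the two scores. It rests on assumptions: Shannon entropy in bits is used for SE, the transform f is the identity by default because the figure does not define it, and all function names are hypothetical.

```python
import math
from collections import Counter
from typing import Callable


def site_entropy(column: str) -> float:
    """Shannon entropy (bits) of the residues observed at one alignment position."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def substitution_count(column: str, residue: str) -> int:
    """Number of homologs carrying the given residue at this position (SN)."""
    return column.count(residue)


def identity(x: float) -> float:
    # Placeholder for f; the figure leaves the transform unspecified.
    return x


def framework_score(ep: float, sh: float, se: float, sn: float, sm: float,
                    f: Callable[[float], float] = identity) -> float:
    """Score_FW = f(EP) x f(SH) x f(SE) x f(SN) x f(SM)."""
    return f(ep) * f(sh) * f(se) * f(sn) * f(sm)


def cdr_score(se: float, sn: float, sm: float,
              f: Callable[[float], float] = identity) -> float:
    """Score_CDR = f(SE) x f(SN) x f(SM)."""
    return f(se) * f(sn) * f(sm)
```

Because the terms are multiplied, a substitution scoring zero on any single term is eliminated regardless of the others; a different choice of f would soften or sharpen that effect.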
Align RSV-19 heavy chain sequence with human germline Ig heavy chain
sequences from VBase using ClustalW.

A: Substitutions set
RULE 1a:

• Enumerate and classify the substitutions into 2 categories: (i) substitutions found in
the framework region and (ii) substitutions found in the CDR.

• Consider only these substitutions (i.e., RULE 1a is a filter)

B: Substitutions from human germline sequences

• Reconstruct phylogenetic tree
RULE 1b:

• Calculate evolutionary proximity of the closest homolog in which each
substitution occurs (EP)

RULE 2b:

• Calculate site heterogeneity at each substitution position (SH)
RULE 3b:

• Calculate entropy at each substitution position (SE)
RULE 4b:

• Calculate number of times a substitution is seen at a position in the set of
homologs (SN)

C: Substitutions from substitution matrices
RULE 1c:

• Calculate favorability of each substitution using a PAM100 matrix (SM).

D: Score

Score_FW = f(EP) × f(SH) × f(SE) × f(SN) × f(SM)
Score_CDR = f(SE) × f(SN) × f(SM)



Figure 23 